
    Paying attention to cardiac surgical risk: An interpretable machine learning approach using an uncertainty-aware attentive neural network

    Machine learning (ML) is increasingly applied to predict adverse postoperative outcomes in cardiac surgery. Commonly used ML models fail to translate to clinical practice due to absent model explainability, limited uncertainty quantification, and inflexibility to missing data. We aimed to develop and benchmark a novel ML approach, the uncertainty-aware attention network (UAN), to overcome these common limitations. Two Bayesian uncertainty quantification methods were tested: generalized variational inference (GVI) and a posterior network (PN). The UAN models were compared with an ensemble of XGBoost models and a Bayesian logistic regression model (LR) with imputation. The derivation dataset consisted of 153,932 surgery events from the Australian and New Zealand Society of Cardiac and Thoracic Surgeons (ANZSCTS) Cardiac Surgery Database. The external validation dataset consisted of 7,343 surgery events extracted from the Medical Information Mart for Intensive Care (MIMIC) III critical care dataset. The highest performing model on the external validation dataset was UAN-GVI, with an area under the receiver operating characteristic curve (AUC) of 0.78 (0.01). Model performance improved on high-confidence samples, with an AUC of 0.81 (0.01). Confidence calibration for aleatoric uncertainty was excellent for all models. Calibration for epistemic uncertainty was more variable, with the ensemble of XGBoost models performing best at an AUC of 0.84 (0.08). Epistemic uncertainty was improved using the PN approach compared to GVI. The UAN uses an interpretable and flexible deep learning approach to provide estimates of model uncertainty alongside state-of-the-art predictions. The model has been made freely available as an easy-to-use web application, demonstrating that, by designing uncertainty-aware models with innately explainable predictions, deep learning may become more suitable for routine clinical use. The ANZSCTS Cardiac Surgery Database Program is funded by the Department of Health (Victoria), the Clinical Excellence Commission (NSW), and Queensland Health (QLD). ANZSCTS Database Research activities are supported through a National Health and Medical Research Council Principal Research Fellowship (APP 1136372) and Program Grant (APP 1092642).
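
    A minimal sketch of the confidence-filtered evaluation described above (scoring only the most confident samples), assuming a model that returns per-sample predictions together with an uncertainty score; the function name and synthetic data are illustrative, not the authors' published code.

        # Confidence-filtered AUC: evaluate only the lowest-uncertainty samples.
        import numpy as np
        from sklearn.metrics import roc_auc_score

        def confidence_filtered_auc(y_true, y_prob, uncertainty, keep_fraction=0.8):
            """AUC restricted to the keep_fraction most confident samples."""
            cutoff = np.quantile(uncertainty, keep_fraction)
            mask = uncertainty <= cutoff
            return roc_auc_score(y_true[mask], y_prob[mask])

        # Synthetic example: predictions plus an uncertainty score per sample.
        rng = np.random.default_rng(0)
        y_true = rng.integers(0, 2, size=1000)
        y_prob = np.clip(y_true * 0.6 + rng.normal(0.2, 0.3, size=1000), 0.0, 1.0)
        uncertainty = rng.exponential(size=1000)  # e.g. predictive entropy

        print("AUC, all samples:     ", roc_auc_score(y_true, y_prob))
        print("AUC, high confidence: ", confidence_filtered_auc(y_true, y_prob, uncertainty))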

    Tree-based survival analysis improves mortality prediction in cardiac surgery

    Objectives: Machine learning (ML) classification tools are known to accurately predict many cardiac surgical outcomes. A novel approach, ML-based survival analysis, remains unstudied for predicting mortality after cardiac surgery. We aimed to benchmark the performance, as measured by the concordance index (C-index), of tree-based survival models against Cox proportional hazards (CPH) modeling, and to explore risk factors using the best-performing model. Methods: 144,536 patients with 147,301 surgery events from the Australian and New Zealand Society of Cardiac and Thoracic Surgeons (ANZSCTS) national database were used to train and validate the models. Univariate analysis was performed using Student's t-test for continuous variables, the chi-squared test for categorical variables, and stratified Kaplan-Meier estimation of the survival function. Three ML models were tested: a decision tree (DT), a random forest (RF), and a gradient boosting machine (GBM). Hyperparameter tuning was performed using a Bayesian search strategy. Performance was assessed using 2-fold cross-validation repeated 5 times. Results: The highest performing model was the GBM with a C-index of 0.803 (0.002), followed by the RF with 0.791 (0.003), the DT with 0.729 (0.014), and finally CPH with 0.596 (0.042). The 5 most predictive features were age, type of procedure, length of hospital stay, drain output in the first 4 h (ml), and inotrope use for more than 4 h postoperatively. Conclusion: Tree-based learning for survival analysis is a non-parametric and performant alternative to CPH modeling. GBMs offer interpretable modeling of non-linear relationships, promising to expose the most relevant risk factors and uncover new questions to guide future research. The ANZSCTS National Cardiac Surgery Database Program is funded by the Department of Health (Victoria), the Clinical Excellence Commission (NSW), Queensland Health (QLD), and the cardiac surgical units participating in the registry. ANZSCTS Database Research activities are supported through a National Health and Medical Research Council Principal Research Fellowship (APP 1136372) and Program Grant (APP 1092642).
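
    A minimal sketch of tree-based survival analysis with a gradient boosting machine, scored by the C-index. The scikit-survival library and the synthetic data are my choices for illustration; the paper does not specify this implementation.

        import numpy as np
        from sksurv.ensemble import GradientBoostingSurvivalAnalysis
        from sksurv.metrics import concordance_index_censored

        rng = np.random.default_rng(42)
        X = rng.normal(size=(500, 5))                  # stand-ins for age, drain output, ...
        time = rng.exponential(scale=np.exp(X[:, 0]))  # survival time depends on feature 0
        event = rng.random(500) < 0.7                  # True = death observed (not censored)

        # scikit-survival expects a structured array of (event indicator, time)
        y = np.array(list(zip(event, time)), dtype=[("event", bool), ("time", float)])

        gbm = GradientBoostingSurvivalAnalysis(n_estimators=200, learning_rate=0.05)
        gbm.fit(X, y)

        risk = gbm.predict(X)  # higher score = higher predicted risk
        cindex = concordance_index_censored(event, time, risk)[0]
        print(f"C-index: {cindex:.3f}")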

    Efficient Cross-Validation of Echo State Networks

    Echo State Networks (ESNs) are known for their fast and precise one-shot learning of time series, but they often need careful hyper-parameter tuning for best performance, and for this, good validation is key. Usually, however, a single validation split is used. In this rather practical contribution we suggest several schemes for cross-validating ESNs and introduce an efficient algorithm for implementing them. In our proposed method of doing k-fold cross-validation, the component that dominates the time complexity of the already quite fast ESN training remains constant (does not scale up with k). The component that does scale linearly with k starts dominating only in some uncommon situations. Thus, in many situations, k-fold cross-validation of ESNs can be done with virtually the same time complexity as a simple single-split validation. Space complexity can also remain the same. We also discuss when the proposed validation schemes for ESNs could be beneficial, and empirically investigate them on several different real-world datasets. Comment: Accepted at the ICANN'19 Workshop on Reservoir Computing.
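
    A sketch of the core idea, assuming the usual ridge-regression ESN readout: harvest reservoir states once (the dominant cost), accumulate the normal-equation matrices per fold, and train each fold's readout by subtracting that fold's contribution from the totals. This illustrates the validation scheme only; reservoir simulation, washout, and state-continuity details are omitted.

        import numpy as np

        def kfold_readout_cv(X, Y, k=5, beta=1e-6):
            """X: (n_states, T) collected reservoir states; Y: (n_out, T) targets."""
            n, T = X.shape
            folds = np.array_split(np.arange(T), k)
            # Per-fold contributions to X X^T and Y X^T (computed once)
            XX = [X[:, f] @ X[:, f].T for f in folds]
            YX = [Y[:, f] @ X[:, f].T for f in folds]
            XX_tot, YX_tot = sum(XX), sum(YX)
            errors = []
            for i, f in enumerate(folds):
                # Train on all folds except i by subtracting fold i's matrices
                A = XX_tot - XX[i] + beta * np.eye(n)   # symmetric
                W = np.linalg.solve(A, (YX_tot - YX[i]).T).T
                errors.append(np.mean((W @ X[:, f] - Y[:, f]) ** 2))
            return float(np.mean(errors))

    The extra cost per fold is one n-by-n linear solve, so the state-harvesting pass that dominates ESN training is paid only once, independent of k.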

    After the epidemic: Zika virus projections for Latin America and the Caribbean

    Background: Zika is one of the most challenging emergent vector-borne diseases, yet its future public health impact remains unclear. Zika was of little public health concern until recent reports of its association with congenital syndromes. By 3 August 2017, ~217,000 Zika cases and ~3,400 cases of associated congenital syndrome had been reported in Latin America and the Caribbean. Some modelling exercises suggest that Zika virus infection could become endemic, in agreement with recent declarations from the World Health Organisation. Methodology/Principal findings: We produced high-resolution, spatially-explicit projections of Zika cases, associated congenital syndromes, and monetary costs for Latin America and the Caribbean now that the epidemic phase of the disease appears to be over. In contrast to previous studies, which have adopted a modelling approach to map Zika potential, we project case numbers using a statistical approach based upon reported dengue case data as a Zika surrogate. Our results indicate that ~12.3 (0.7–162.3) million Zika cases could be expected across Latin America and the Caribbean every year, leading to ~64.4 (0.2–5,159.3) thousand cases of Guillain-Barré syndrome and ~4.7 (0.0–116.3) thousand cases of microcephaly. The economic burden of these neurological sequelae is estimated to be ~USD 2.3 (USD 0–159.3) billion per annum. Conclusions/Significance: Zika is likely to have significant public health consequences across Latin America and the Caribbean in years to come. Our projections inform regional and federal health authorities, offering an opportunity to adapt to this public health challenge.
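
    A toy numeric sketch of the surrogate logic described above: scale reported dengue cases by an uncertain Zika-to-dengue ratio, then apply per-case sequela risks. Every number below is an illustrative placeholder, not a fitted value from the paper.

        import numpy as np

        rng = np.random.default_rng(1)
        dengue_cases = 2.5e6                          # hypothetical annual dengue cases
        zika_ratio = rng.lognormal(0.0, 0.5, 10_000)  # assumed uncertain surrogate ratio

        zika = dengue_cases * zika_ratio              # projected Zika cases
        gbs = zika * 2.4e-4                           # assumed Guillain-Barre risk per case
        microcephaly = zika * 1.0e-4                  # assumed microcephaly risk per case

        for name, x in [("Zika", zika), ("GBS", gbs), ("Microcephaly", microcephaly)]:
            lo, med, hi = np.percentile(x, [2.5, 50, 97.5])
            print(f"{name}: median {med:,.0f} (95% interval {lo:,.0f}-{hi:,.0f})")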

    Forecasting: theory and practice

    Forecasting has always been at the forefront of decision making and planning. The uncertainty that surrounds the future is both exciting and challenging, with individuals and organisations seeking to minimise risks and maximise utilities. The large number of forecasting applications calls for a diverse set of forecasting methods to tackle real-life challenges. This article provides a non-systematic review of the theory and the practice of forecasting. We provide an overview of a wide range of theoretical, state-of-the-art models, methods, principles, and approaches to prepare, produce, organise, and evaluate forecasts. We then demonstrate how such theoretical concepts are applied in a variety of real-life contexts. We do not claim that this review is an exhaustive list of methods and applications. However, we hope that our encyclopedic presentation will offer a point of reference for the rich work that has been undertaken over the last decades, with some key insights for the future of forecasting theory and practice. Given its encyclopedic nature, the intended mode of reading is non-linear, and we offer cross-references to allow readers to navigate through the various topics. We complement the theoretical concepts and applications covered with large lists of free or open-source software implementations and publicly available databases.

    How to evaluate sentiment classifiers for Twitter time-ordered data?

    Social media are becoming an increasingly important source of information about the public mood regarding issues such as elections, Brexit, and the stock market. In this paper we focus on sentiment classification of Twitter data. Constructing sentiment classifiers is a standard text mining task, but here we address the question of how to properly evaluate them, as there is no settled way to do so. Sentiment classes are ordered and unbalanced, and Twitter produces a stream of time-ordered data. The problem we address concerns the procedures used to obtain reliable estimates of performance measures, and whether the temporal ordering of the training and test data matters. We collected a large set of 1.5 million tweets in 13 European languages. We created 138 sentiment models and out-of-sample datasets, which are used as a gold standard for evaluations. The corresponding 138 in-sample datasets are used to empirically compare six different estimation procedures: three variants of cross-validation and three variants of sequential validation (where the test set always follows the training set). We find no significant difference between the best cross-validation and sequential validation. However, we observe that all cross-validation variants tend to overestimate performance, while the sequential methods tend to underestimate it. Standard cross-validation with random selection of examples is significantly worse than blocked cross-validation, and should not be used to evaluate classifiers on time-ordered data.
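
    A minimal sketch contrasting the three families of estimation procedures compared here: standard cross-validation with shuffling, blocked cross-validation on contiguous chunks, and sequential validation where the test set always follows the training data. The classifier and features are placeholders for a real sentiment pipeline.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import KFold, TimeSeriesSplit, cross_val_score

        rng = np.random.default_rng(0)
        X = rng.normal(size=(2000, 20))  # stand-in for tweet features, in time order
        y = (X[:, 0] + rng.normal(size=2000) > 0).astype(int)

        clf = LogisticRegression(max_iter=1000)
        schemes = [
            ("shuffled CV", KFold(n_splits=5, shuffle=True, random_state=0)),  # ignores time order
            ("blocked CV", KFold(n_splits=5, shuffle=False)),                  # contiguous blocks
            ("sequential", TimeSeriesSplit(n_splits=5)),                       # test follows train
        ]
        for name, cv in schemes:
            scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")
            print(f"{name:11s} mean accuracy: {scores.mean():.3f}")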

    Machine Learning Algorithms for Predicting and Risk Profiling of Cardiac Surgery-Associated Acute Kidney Injury

    Using a large national database of cardiac surgical procedures, we applied machine learning (ML) to risk stratification and profiling for cardiac surgery-associated acute kidney injury, and compared the performance of ML to established scoring tools. Four ML algorithms were used: logistic regression (LR), a gradient boosted machine (GBM), K-nearest neighbor, and neural networks (NN). These were compared to the Cleveland Clinic score and a risk score developed on the same database. Five-fold cross-validation repeated 20 times was used to measure the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. Risk profiles from the GBM and NN were generated using Shapley additive values. A total of 97,964 surgery events in 96,653 patients were included. For predicting postoperative renal replacement therapy using pre- and intraoperative data, LR, GBM, and NN achieved an AUC (standard deviation) of 0.84 (0.01), 0.85 (0.01), and 0.84 (0.01) respectively, outperforming the highest performing scoring tool with 0.81 (0.004). For predicting cardiac surgery-associated acute kidney injury, LR, GBM, and NN achieved 0.77 (0.01), 0.78 (0.01), and 0.77 (0.01) respectively, outperforming the scoring tool with 0.75 (0.004). Compared to the scores and LR, Shapley additive values analysis of black-box model predictions was able to generate patient-level explanations for each prediction. ML algorithms provide state-of-the-art approaches to risk stratification, and explanatory modeling can exploit complex decision boundaries to aid the clinician in understanding the risks specific to individual patients.
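
    A minimal sketch of patient-level explanation with Shapley additive values for a gradient boosted model. The xgboost and shap libraries and the synthetic data are my choices for illustration; the study's own feature set and models are not reproduced here.

        import numpy as np
        import shap
        import xgboost

        rng = np.random.default_rng(0)
        X = rng.normal(size=(1000, 6))  # stand-ins for perioperative features
        y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1000) > 1).astype(int)

        model = xgboost.XGBClassifier(n_estimators=100, max_depth=3).fit(X, y)

        explainer = shap.TreeExplainer(model)
        shap_values = explainer.shap_values(X)  # one additive contribution per feature

        # Patient-level explanation: each feature's additive push on this
        # patient's predicted risk (on the model's margin scale).
        patient = 0
        for j, contribution in enumerate(shap_values[patient]):
            print(f"feature_{j}: {contribution:+.3f}")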